{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "view-in-github"
   },
   "source": [
    "<a href=\"https://colab.research.google.com/github/peremartra/Large-Language-Model-Notebooks-Course/blob/main/3-LangChain/3_2_LLAMA2_Moderation_Chat.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gC-aV7J4K_uR",
    "jp-MarkdownHeadingCollapsed": true
   },
   "source": [
    "<div>\n",
    "    <h1>Large Language Models Projects</a></h1>\n",
    "    <h3>Apply and Implement Strategies for Large Language Models</h3>\n",
    "    <h2>3.2-Create a Moderation System using LangChain.</h2>\n",
    "    <h3>LLAMA 2 version</h3>\n",
    "  </div>\n",
    "\n",
    "by [Pere Martra](https://www.linkedin.com/in/pere-martra/)\n",
    "\n",
    "________\n",
    "Model: Llama-2-7b-chat-hf / meta-llama/Meta-Llama-3.1-8B-Instruct\n",
    "\n",
    "Colab Environment: GPU-T4 High-RAM. (If you wwant to test Llama-3.1 you'll need a bigger GPU).\n",
    "\n",
    "Keys:\n",
    "* Lagchain\n",
    "* Llama\n",
    "* Moderator.\n",
    "\n",
    "Related article: [Create a Self-Moderated Comment System with LLAMA-2 and LangChain](https://levelup.gitconnected.com/create-a-self-moderated-comment-system-with-llama-2-and-langchain-656f482a48be)\n",
    "_______________"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "gIV3BBY_LULN"
   },
   "source": [
    "# How To Create a Moderation System Using LangChain & Hugging Face.\n",
    "\n",
    "We are going to create a moderation system based on two models.\n",
    "\n",
    "The first model reads user comments and generates responses.\n",
    "\n",
    "The second language model then analyzes the generated response, identifying any negativity and modifying the response if necessary.\n",
    "\n",
    "This process aims to prevent negative or inappropriate user input from triggering a similarly negative or off-tone response from the comment system."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "oI5hcVHoK-Ol",
    "outputId": "47191060-8d4c-4cac-d83c-cce92f9efaff"
   },
   "outputs": [],
   "source": [
    "#Install de LangChain and openai libraries.\n",
    "!pip install -q langchain==0.2.11\n",
    "!pip install -q transformers==4.43.1\n",
    "!pip install -q accelerate==0.32.1\n",
    "!pip install -q langchain-community==0.2.10\n",
    "!pip install -q huggingface_hub  #==0.24.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 304
    },
    "id": "cKgYdJv7qPGs",
    "outputId": "773403fe-2cf4-4812-e0fe-d997f25866ab"
   },
   "outputs": [],
   "source": [
    "from getpass import getpass\n",
    "hf_key = getpass(\"Hugging Face Key: \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Q5AkbmVRp6Iq"
   },
   "outputs": [],
   "source": [
    "!huggingface-cli login --token $hf_key"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1R1QSNGpOYCK"
   },
   "source": [
    "## Importing LangChain Libraries.\n",
    "* PrompTemplate: provides functionality to create prompts with parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "CuBxdjshLYDM"
   },
   "outputs": [],
   "source": [
    "from langchain import PromptTemplate\n",
    "from langchain_core.output_parsers import StrOutputParser\n",
    "\n",
    "from langchain.llms import HuggingFacePipeline\n",
    "import transformers\n",
    "from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "AIfbM8wcOh1h"
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "from torch import cuda"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "BoYi2-wHfZrV"
   },
   "outputs": [],
   "source": [
    "#In a MAC Silicon the device must be 'mps'\n",
    "# device = torch.device('mps') #to use with MAC Silicon\n",
    "device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "pLvC0ENnfiHU"
   },
   "outputs": [],
   "source": [
    "device"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "axOsUJ79MMrr"
   },
   "source": [
    "##Load the Model ."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "D6UOQjyJP9rb"
   },
   "outputs": [],
   "source": [
    "#You can try with any llama model, but you will need more GPU and memory as you\n",
    "#increase the size of the model.\n",
    "model_id = \"meta-llama/Llama-2-7b-chat-hf\"\n",
    "#model_id = \"meta-llama/Meta-Llama-3.1-8B-Instruct\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ONwHLj5Lwfde"
   },
   "outputs": [],
   "source": [
    "# begin initializing HF items, need auth token for these\n",
    "model_config = transformers.AutoConfig.from_pretrained(\n",
    "    model_id,\n",
    "    token=hf_key\n",
    ")\n",
    "\n",
    "model = AutoModelForCausalLM.from_pretrained(\n",
    "    model_id,\n",
    "    trust_remote_code=True,\n",
    "    config=model_config,\n",
    "    device_map='auto',\n",
    "    token=hf_key\n",
    ")\n",
    "model.eval()\n",
    "print(f\"Model loaded on {device}\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "uDNKqttDoD4a"
   },
   "outputs": [],
   "source": [
    "tokenizer = AutoTokenizer.from_pretrained(model_id,\n",
    "                                          use_aut_token=hf_key)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "trfF_dcKqyXj"
   },
   "outputs": [],
   "source": [
    "pipe = pipeline(\n",
    "    \"text-generation\",\n",
    "    model=model,\n",
    "    tokenizer=tokenizer,\n",
    "    max_new_tokens=128,\n",
    "    temperature=0.1,\n",
    "    #do_sample=False,\n",
    "    top_p=0,\n",
    "    #trust_remote_code=True,\n",
    "    eos_token_id=tokenizer.eos_token_id,\n",
    "    repetition_penalty=1.1,\n",
    "    return_full_text=True,\n",
    "    device_map='auto'\n",
    ")\n",
    "\n",
    "assistant_llm = HuggingFacePipeline(pipeline=pipe)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UluKkDpVTb8W"
   },
   "source": [
    "## Create the template for the first model called assistant.\n",
    "\n",
    "The prompt receives 2 variables, the sentiment and the customer_request, or customer comment.\n",
    "\n",
    "I included the sentiment to facilitate the creation of rude or incorrect answers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "kN5ArR3DQum0"
   },
   "outputs": [],
   "source": [
    "# Instruction how the LLM must respond the comments,\n",
    "assistant_template = \"\"\"\n",
    "[INST]<<SYS>>You are {sentiment} assistant that responds to user comments,\n",
    "using similar vocabulary than the user.\n",
    "Stop answering text after answer the first user.<</SYS>>\n",
    "\n",
    "User comment:{customer_request}[/INST]\n",
    "assistant_response\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "A5kulgS3Tg-6"
   },
   "outputs": [],
   "source": [
    "#Create the prompt template to use in the Chain for the first Model.\n",
    "assistant_prompt_template = PromptTemplate(\n",
    "    input_variables=[\"sentiment\", \"customer_request\"],\n",
    "    template=assistant_template\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kqscrfkaTlze"
   },
   "source": [
    "Now we create a First Chain. Just chaining the ***assistant_prompt_template*** and the model. The model will receive the prompt generated with the prompt_template."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "t-TojsLETi1I"
   },
   "outputs": [],
   "source": [
    "output_parser = StrOutputParser()\n",
    "assistant_chain = assistant_prompt_template | assistant_llm | output_parser"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "dBIymqlCTq4l"
   },
   "source": [
    "To execute the chain created it's necessary to call the ***.run*** method of the chain, and pass the variables necessaries.\n",
    "\n",
    "In our case: ***customer_request*** and ***sentiment***."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ZWdmIxBGTo-S"
   },
   "outputs": [],
   "source": [
    "#Support function to obtain a response to a user comment.\n",
    "def create_dialog(customer_request, sentiment):\n",
    "    #callint the .invoke method from the chain created Above.\n",
    "    assistant_response = assistant_chain.invoke(\n",
    "        {\"customer_request\": customer_request,\n",
    "        \"sentiment\": sentiment}\n",
    "    )\n",
    "    return assistant_response"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "FOlZymNDTuk1"
   },
   "source": [
    "## Obtain answers from our first Model Unmoderated.\n",
    "\n",
    "The customer post is really rude, we are looking for a rude answer from our Model. To obtain it we are changing the ***sentiment*** variable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "ZJMH8CEuTtUA"
   },
   "outputs": [],
   "source": [
    "# This the customer request, or customere comment in the forum moderated by the agent.\n",
    "# feel free to update it.\n",
    "customer_request = \"\"\"Your product is a piece of shit. I want my money back!\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "pJneR3CsTzhB"
   },
   "outputs": [],
   "source": [
    "# Our assistatnt working in 'nice' mode.\n",
    "assistant_response=create_dialog(customer_request, \"nice\")\n",
    "print(assistant_response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "KznRBMbeT2K2"
   },
   "outputs": [],
   "source": [
    "#Our assistant running in rude mode.\n",
    "assistant_response = create_dialog(customer_request, \"most rude assistant that exist\")\n",
    "print(assistant_response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "sVPa-bl3MSbT"
   },
   "source": [
    "Okay, this answer needs some moderation! Fortunately, we are actively working on it!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "O9nzBDprNytc"
   },
   "source": [
    "## Moderator\n",
    "Let's create the second moderator. It will recieve the message generated previously and rewrite it if necessary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "KebcVF5BVWdF"
   },
   "outputs": [],
   "source": [
    "#The moderator prompt template\n",
    "moderator_template = \"\"\"\n",
    "[INST]<<SYS>>You are the moderator of an online forum, you are strict and will not tolerate any negative comments.\n",
    "You will receive an original comment and if it is impolite you must transform into polite.\n",
    "Try to mantain the meaning when possible.<</SYS>>\n",
    "\n",
    "Original comment: {comment_to_moderate}/[INST]\n",
    "\"\"\"\n",
    "\n",
    "# We use the PromptTemplate class to create an instance of our template that will use the prompt from above and store variables we will need to input when we make the prompt.\n",
    "moderator_prompt_template = PromptTemplate(\n",
    "    input_variables=[\"comment_to_moderate\"],\n",
    "    template=moderator_template\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "KDh0Ik3zN6um"
   },
   "outputs": [],
   "source": [
    "moderator_llm = assistant_llm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "BLUUW-xxOAli"
   },
   "outputs": [],
   "source": [
    "#We build the chain for the moderator.\n",
    "\n",
    "moderator_chain = moderator_prompt_template | moderator_llm | output_parser"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "GB8LlEjLVZI6"
   },
   "outputs": [],
   "source": [
    "assistant_response"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "3qZRGTrIT7KE"
   },
   "outputs": [],
   "source": [
    "# To run our chain we use the .invoke() command\n",
    "moderator_says = moderator_chain.invoke({\"comment_to_moderate\": assistant_response})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "s4x37oVAZa9d"
   },
   "outputs": [],
   "source": [
    "print(moderator_says)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Azfj6yzXV_ae"
   },
   "source": [
    "This answer is more polite that the one produce by the  **\"rude\" assistant**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1GtiMkHuWEhH"
   },
   "source": [
    "## LangChain System\n",
    "Now is Time to put both models in the same Chain and that they act as if they were a sigle model.\n",
    "\n",
    "We have both models, amb prompt templates, we only need to create a new chain and see hot it works.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Daen3DEWWNEq"
   },
   "source": [
    "\n",
    "It's necessary to indicate the chains and the parameters that we shoud pass in the **.run** method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "dXcC6OsJWJSE"
   },
   "outputs": [],
   "source": [
    "assistant_moderated_chain = (\n",
    "    {\"comment_to_moderate\":assistant_chain}\n",
    "    |moderator_chain\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "g-Xhe1nLWSTY"
   },
   "source": [
    "Lets use our Moderating System!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "dsFnDEv8WQVG"
   },
   "outputs": [],
   "source": [
    "# We can now run the chain.\n",
    "from langchain.callbacks.tracers import ConsoleCallbackHandler\n",
    "assistant_moderated_chain.invoke({\"sentiment\": \"very rude\", \"customer_request\": customer_request},\n",
    "                                 config={'callbacks':[ConsoleCallbackHandler()]})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "m_Z4emYLOQUR"
   },
   "source": [
    "## Conclusions\n",
    "As You can see the moderator changes the answer of our assistant. The one produced by the moderator is by far more polite than the original response created by the rude assistant.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Y1CYas_AObDJ"
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "gpuType": "T4",
   "include_colab_link": true,
   "machine_shape": "hm",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
